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SUMMARY 


The performance of a linear multigrid method using four smoothing methods, called SCGS, CLGS, 
SILU and CILU, is investigated for the incompressible Navier-Stokes equations in general coordinates, 
in association with Galerkin coarse grid approximation. Robustness and efficiency are measured and 
compared by application to test problems. The numerical results show that CILU is the most robust, 
SILU the least, with CLGS and SCGS in between. CLGS is the best in efficiency, SCGS and CILU 
follow, and SILU is the worst. 


INTRODUCTION 


Robustness and efficiency of a multigrid method are strongly influenced by the smoother used. 
Because there are so many factors influencing robustness and efficiency, it is hard to say in general 
which method is the most appropriate choice for certain applications. In this paper, we study four 
smoothing methods for the incompressible Navier-Stokes equations in general coordinates, namely the 
SCGS (Symmetrical Coupled GauB-Seidel [18]), CLGS (Collective Line GauB-Seidel, adapted from 
SCAL [16]), SILU (Scalar ILU, or TILU in [23]) and CILU (Collective ILU [30]), respectively, which 
are used in a linear multigrid method. Galerkin coarse grid approximation (GCA) is used. An 
elementary introduction to GCA can be found in [21]. Application to the Navier-Stokes equations is 
discussed in [29] and [31]. 

The multigrid method using the above four smoothers solves the velocity and the pressure 
simultaneously (collectively). Decoupled solution is also used in practice, solving the velocity and the 
pressure separately. A comparison is given in [1] of multigrid methods using coupled solution with 
SCGS and CLSOR (Coupled Line Successive Over Relaxation) smoothing and multigrid methods using 
decoupled solution. Comparisons are presented in [13] and [14] for multigrid methods using the SCGS 
method and methods using the uncoupled MGPC method (Multigrid Pressure Correction) and the SPC 
(SIMPLE Pressure Correction) smoothing methods by means of local Fourier analysis as well as 
numerical experiments. It is stated in [17] that it is advantageous to use the coupled approach. 
However, both coupled and decoupled solution methods are widely used in practice. 
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The comparisons mentioned above are made for nonlinear multigrid methods in which the coarse 
grid operators are computed by using discretization coarse grid approximation (DCA). The relative 
merits of DCA and GCA are discussed in [21]. A nonlinear multigrid using DCA for the applications 
discussed in the present paper is presented in [8], [9], [10]. Here we apply GCA. Our main reason is that 
discretization of the Navier- Stokes equations in general coordinates on a staggered grid is a complicated 
affair, and GCA enables us to completely separate discretization and multigrid solution. In this paper, 
attention is focussed on smoothing. 


EQUATIONS AND DISCRETIZATION 


The incompressible Navier-Stokes equations in tensor formulation in general curvilinear coordinates 
are given as follows: 

dU° 

~gf + + g^p, e - .Re- 1 (sfUZ + fl/},) y = B°, ( 1 ) 

U % = 0 - ( 2 ) 

Here U a , a = 1, 2, - • • , d, are the contravariant velocity components with d the number of space 
dimensions, p is the pressure, t is the time, B a is the contravariant component of the body force, and g a ^ 
is the metric tensor. About tensor notation, see [2] for more details. U°q is the contravariant derivative. 
Readers not familar with tensor analysis can understand what is going on by assuming that Cartesian 
coordinates are used, and interpreting U a p as dU a /dx p . U° and B a are defined by U a =a° • u, 

B a = a Q ■ b, where u and b are the physical velocity vector and the body force, respectively, and a Q is 
the contravariant base vector of the general coordinate system. Let x = (x 1 , x 2 , ■ ■ • ,x d ) be a Cartesian 
coordinate system and £ = (£\£ 2 , ■ • • ,f d ) be a general coordinate system. Then the contravariant base 
vector a Q is defined as a° = grad(£ Q ), and the metric tensor g a & is defined by g af} = a Q • sP. It is found 
that to achieve better accuracy the variable V a - y/gU a should be used instead of U* ([7], [12], [22]), 
with y/~g the Jacobian of the transformation x i — > £ : yj~g = \dx a /d(?\. 

A finite volume discretization of equations (1) and (2) is presented in [7], [12], [22] on staggered 
grids in general coordinates. From now on we concentrate on two dimensions. Cells may be indexed by 
a two-tuple of integers i = (Mj) € Q, Q = {1,2, • . . ,/} x {1, 2, • . • , J>, with / and J the number of 
cells in the and the ^-direction. The index system for discrete variables is defined as follows. The 
V variable at the center of the left face, the V 2 variable at the center of the lower face and the p 
variable at the center of a cell have the same index as the cell. Cells can be numbered in many ways. 

But unless indicated otherwise, we use the lexicographic order. Variables can also be numbered in 
different ways, for example, block wise ordering. We use blockwise ordering for representation of 
equations; orderings used in the smoothers may be different and are specified together wifi the 
smoothers. In blockwise ordering, V 1 , V 2 and p are ordered separately: 

i^k > Vk+ 1 > ' ' ' iPkiPk+u ’ ’ ')■ V = (V 1 , V 2 ), B = (f? 1 , B 2 ) and p represent the 
discrete velocity, the body force and the pressure gnd functions, respectively. The discretization results 
in the following discrete system: 


_i_V n+1 + 0Q'(V n+1 ) + 0Gp n+1 — fv'(n+l) 
DV nl = f«=(n+l) 


( 3 ) 
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with 


( 4 ) 


P'(»+i) = $B n+l + (1 - 9) B n + ^V n - (1 - 0)Q'(V n ) - (1 - 0)Gp n , 

fc(n+l) _ Q 

The superscript n denotes the time level. The parameter 9 is in [0,1]. The backward Euler method is 
obtained by setting 9 = 1, which is the method used in our numerical experiments. Note that (3) is 
nonlinear, and is to be solved by a linear multigrid method. Therefore it should be linearized outside 
multigrid iterations at each time level. Linearization with Newton’s method gives 

(u a u 0 ) n+ 1 = {u a ) n+l (u p ) n + (u°) n (u p ) n+1 - (s) 

So we have 

Q'(V n+1 ) = Q a V n+1 + Q 2 (V n ) (6) 

with both Qi and Q 2 evaluated at time level n. Note that Qi is linear. As a consequence, using 
blockwise ordering, the linear system to be solved at each time level can be written as 

Kx = f, ( 7 ) 


where 



with 

Q = — + 0Q U r (n+1) = f v,{n+1) - 0Q 2 (V n ). 
A stationary solution is reached if 


is satisfied, where 



( 8 ) 

( 9 ) 

( 10 ) 

( 11 ) 


THE SMOOTHING METHODS 


In this section, the four smoothing methods to be used, SCGS, CLGS, SILU and CILU, are 
described briefly. SCGS is of collective point GauB-Seidel type. It is a well-known fact that 
GauB-Seidel smoothing is not robust when cells in physical space are stretched, which occurs often in 
general boundary fitted coordinates. Line smoothers are better than point smoothers in handling such 
problems. Based on the idea of SCGS, a line version called SCAL is presented in [16]. Successful 
applications of the SCGS and the SCAL methods to problems in Cartesian coordinates can be found in, 
for instance, [4], [15], [16], [18]. Satisfactory results are also reported for problems in general 
coordinates ([8], [9], [10], [1 1]). The results show that SCAL seems to be more attractive than SCGS. 
Good smoothers may also be derived by employing ILU factorization. For a survey of ILU smoothers, 
see [20]. Two versions of ILU smoothers, called SILU and CILU, for the incompressible Navier-Stokes 
equations are presented in [23] and [30], respectively. 
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The SCGS Method 


The SCGS method updates variables cell by cell in a smoothing sweep, first in lexicographical order 
and then in backward lexicographical order. The five variables, say for cell i € Q, 

Vi i K+ei > K 2 > K+e 2 > Pi' which are located at the centers of the four cell faces and the center of cell i, 
are updated simultaneously, with e\ = (1, 0), = (0, 1). For convenience, we introduce eo = (0, 0). 

Let the array y contain the above five variables, and let the local system for the correction by of y be 
given by 

A<5y = c, (12) 

with c T = (cf , cf £] , cf , cf +ei . cf ) and A a 5 x 5 matrix. The local system is formulated as follows. 
Equation (8) can be written, in more detail, as 



/ Q 11 Q 12 dG 1 > 


V 1 \ 

/ fvl 

K = 

[ Q 21 Q 22 6G 2 

, x = 

V 2 , f = 

fv2 


{ D 1 D 2 0 ) 


Ip/ 

(P 


(13) 


c contains the residuals of the five equations corresponding to the five variables and is computed by 

cf = (P 1 - CTV 1 - Q 12 V 2 - G’p)*, cf +ei = (P 1 — Q n V J - Q 12 V 2 — G 1 p) i+Cl , 
cf = (P 2 - Q^V 1 - Q 22 V 2 - G 2 p)i, cf e2 = (P 2 — Q 21 V 1 — Q 22 V 2 — G 2 p)i+ e2 > (14) 

cf = (f c - D 1 ^ - D 2 V 2 )i. 

Using stencil notation ([21]), A can be written as 


( Q n (*,e 0 ) 


A = 


Q”(* + Cl) Co) 


GHi.co) ^ 
G z (i + ei, — ei) 

Q 22 ^, e o) G 2 (z, eo) 

Q 22 (i + e2, eo) G 2 (z + e2, — e2) 
D 2 (i,e 2 ) 0 


\ D^i.co) DHt.c,) D 2 (z, e 0 ) 

Equation (15) is solved analytically. The correction 6y is added immediately to y: 


(15) 


y:=y + u)6y, (16) 

where u is an underrelaxation factor. 


The CLGS Method 


The CLGS method is in fact the same as the SCAL method proposed in [16], except that a 
smoothing sweep is composed of line GauB-Seidel in CLGS instead of alternating zebra in SCAL. So 
CLGS updates variables line by line successively. Let the vector y accommodate the variables for a 
whole horizontal z 2 - line of cells: 


y T = (■ .. V } V 2 V 2 , V V 1 V 2 V 2 n , 

J V ? v i i y i J K t+ei i v x -fci ) K t-j-C]+e 2 > 

^i+2ei J K+2 ci J ^i+2ej+c 2 ) Pi+2ci *’’*)> ^ = (^1 > ^ 2 ) € 


(17) 
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Updating y gives horizontal line GauB-Seidel smoothing. Similarly, if the line is taken vertical for a 
fixed ii, we have the following arrangement of variables: 

y T = (•••, »?,*?, Ift.,, p i,V? + , a ,V> + ,„V^. lW Pi+'„ 

K+2c2»^+2e2»^i+2c2+ei>P*+2e2 J ’ ' ')> ^ “ (^1)^) ^ Q » 

which gives vertical line GauB-Seidel smoothing. We will use forward horizontal line smoothing, unless 
indicated otherwise. Other types of line smoothings can be constructed easily by changing the sequence 
of visiting lines. Let the equation system for the correction 6y of y be denoted again by equation (12), 
which is readily derived from equation (7), with c the residuals of the equations corresponding to the 
variables in y. For a line, for example with ii fixed, the variables having the same i\ are grouped 
together to form a 4-vector (V? ,V?,V? +e2 ,p t ). This collective arrangement of variables results in a 
4x4 block matrix representation of the matrix A, which has non-zero elements (4x4 matrices) at 
positions (»!,*! ± Mi - 2) in the ^-th row of A. Solution of equation (12) can be carried out easily by 
using block LU factorization, which needs no further discussion. Updating is performed by (16). 
Because variables are collectively updated and line GauB-Seidel relaxation is employed, this method is 
called CLGS. 


The SILU Method 


The SILU method is constructed as follows. Because K in (7) is indefinite, it is hard to find a 
regular splitting ([19]) 

K = M - N (19) 

such that the classical iteration 

x <+1 = x i - M'^Kx* - b) (20) 

converges. Therefore, an r-transformation K is used ([23], [24], [25], [26]), and a regular splitting 


KK = M - N 

is easier to find. Equation (21) corresponds to the following splitting of K: 

K = MK 1 - NK 1 . 

So with underrelaxation, the iteration (20) is revised as 

x i+1 = x< - wKM" 1 (Kx’ - b). 

The matrix K chosen and the product KK are given by 

( I -Q -1 GE _1 F \ ^ _ / Q 0 

K= (o E-'F j’ KK -(D -F 

with E = DQ -1 G and F = DG. Since K involves the computation of Q -1 and E _1 , which is not 
practical, the following approximation K of K is applied: 


( 21 ) 

( 22 ) 

(23) 


( 24 ) 


K = 


I -G 
0 F -1 DQ _1 G j ’ 


( 25 ) 
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( 26 ) 


where F = diag(DG) and Q = diag(Q). Hence we use: 

x <+1 = x* - u/KM _ 1 (Kx* - b). 

Note that the approximation K of K is different from that used in [23], which is 



(27) 


because we found with (27) the multigrid method does not work. The ILU factorization of KK uses a 
nine-point ILU factorization, in which the ordering of variables is nested, that is, 

, VH,Vk,Pk, H+i> Yk+uPk+it ’ ■ •)• So equation (21) can be rewritten as 

KK = (L + D)D -1 (D + U) - N (28) 

with L and U strictly lower and upper triangular matrices and D a diagonal matrix. For the k-th row of 
the matrix M, the non-zero pattern G for incomplete factorization is chosen to be a nine-point pattern 
G = (k, k ± 3, k ± 3/, k ± 3/ ± 3, k ± I 3) with I the number of cells in the ^-direction, and the 
elements of M in G are chosen to be equal to the corresponding elements of KK. In this paper, this 
method is referred to as SILU because it works with scalar elements of matrices and to distinguish it 
from CILU, which works with block elements (here 3x3 matrices) and is explained now. 


The CILU Method 


CILU differs from SILU in two 
unknowns. The r-transformation K 



aspects: the choice of r-transformation and a collective treatment of 
and the corresponding KK are given by 


-Q 

Cl 


KK = 


Q (C-i)g\ 

D -DQ -1 G ) 


(29) 


Note that a parameter ( is introduced. It is observed that £ sometimes has significant effect on 
convergence (cf. [30]), but here for simplicity it is fixed at 2, which is found to be a good compromise 
for different problems. Obviously, K and KK both should be approximated since the computation of 
Q -1 is impracticable. They are approximated by: 


K = 




KK = 


Q (C-i)G \ 
D -DQ-'G j ’ 


respectively. KK is approximately factorized as follows: 

KK = M- N=(L + D)D -1 (D + U) - n. 


(30) 


(31) 


Similar to CLGS, variables are grouped together. For cell i, three variables having the same cell index 
are grouped in a 3-vector (V^ 1 , V?,Pi). Of course, this corresponds to nested ordering. This collective 
treatment of variables leads to a 3 x 3 block matrix representation of KK. The ILU factorization works 
with the 3x3 blocks as elements. Because of the collective treatment, we call the resulting ILU method 
CILU. In a typical row, for example row k, KK has non-zero elements (3x3 matrices) at positions 
(k, k ± 1, k ± I, k ± I ± 1, k ± I T 1, k — 2, k + I — 2, k — 21, k — 21 + 1). We choose the following 
non-zero pattern G = (fc, k ± 1, k ± I, k ± I ± 1, k ± I T 1) for the approximate factorization. 
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THE LINEAR MULTIGRID ALGORITHM 


The linear multigrid algorithm solves the linearized equation system (7) at each time step. The 
F-cvcle will be used. The number of pre-smoothings and the number of post-smoothings are . 

The coarsest grid will be as coarse as possible; the coarsest grid is 2 x 2 in our cases. A direct solver is 
applied on the coarsest grid. The transfer operators for prolongation may be different m the computation 
of coarse grid matrices by means of Galerkin coarse grid approximation and m the computation o 
coarse grid correction. For the computation of coarse grid matrices, the prolongation operators for the 
velocities in the momentum equations are bilinear interpolation, but the hybnd interpolation [28] is used 
in the continuity equation in order to preserve the structure of the matrix on every gnd. The 
prolongation operator for the pressure is a piecewise constant interpolation. For completeness, we 
describe the hybrid interpolation here. Cell-centered coarsening is used, taking umons of four fine gn 
cells to form a coarse grid cell, as illustrated in figure 1. The correspondence between the numbering of 


- 

-2 i - 

- 

- 


Figure 1. A cell of Q 1 and the corresponding four ceUs of Q\ the grid points are indicated by — . 


the variables V 1 C U : G 1 ' — ♦ Tl on the coarse grid Q l and of V 1 C U : Q l * + K on the fine gnd Q 

is also presented in this figure; coarse grid quantities are indicated by an overbar. The hybnd 
interpolation P 1 : U 1 ■ — > U 1 is constructed by using linear interpolation in the £ -direction but zeroth 

order interpolation in the £ 2 -direction: 


we 2 we 
we 2 we 


where P 1 * is the adjoint of P 1 (cf. [21] for this way of specifying a prolongation). Here w - 0 when 
the “west” point refers to a point outside domain and w - 1 elsewhere, and similarly for e relative to 
“east” points. The underlined element indicates that the corresponding point has index 2% on the tine 
grid, if the operator is applied to point i on the coarse grid. The hybrid interpolation P for V is 
constructed similarly. Coarse grid correction is computed by using bilinear interpolation for the 
velocities and piecewise constant interpolation for the pressure. The restriction operators use the adjoin 
of the hybrid interpolation for the momentum equations and that of the piecewise constant interpolation 
for the continuity equation. More details about the choice of transfer operators are given in ‘ L ^ 
[31], and an efficient computation of Galerkin coarse grid approximation is presented in [29] an [ J 

for systems of equations. 
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Reduction factors are used as measure of the performance of multigrid. The average and the 
asymptotic reduction factor will be presented. Let r = ||r[|, with r = f - Kx the residual of equation (7) 
and the norm ||r|| defined by 


ii r ii = (jj E ^E - k~x);/jv?)Y , (33) 

with M the number of partial differential equations and N™ the number of grid points in £ m . At each 
time step we have a linearized equation system which is solved by a number of linear multigrid 
iterations. Let tq and r n denote respectively the residual norm before and after n cycles of multigrid 
iterations on the finest grid. The average reduction factor p is defined by 

P = (r n /r 0 )" . (34) 

The reduction factor p { at the i-th iteration is defined by 

Pi = Ti/Ti-i. 

If a limit of pi exists, then it is the asymptotic reduction factor. Define r 3 = ||r s ||, with r a = f a 
the residual of equation (10). A steady state is approximately obtained if 

r* a /r° < e < 1 (36) 

is satisfied, where is r a at time 0 and r* is r a at time t. The values of c are reported in figures 3-8. 

From the results of the following experiments, we choose the most robust method and undertake a 
further test, which aims at finding a proper choice of prolongation operators for the formulation of 
coarse grid operators. So the prolongation operators for the velocity in the momentum equations now 
use the hybrid interpolation for the velocities in the continuity equation. This specification of 
prolongation operators violates the well-known accuracy condition ([6]) for transfer operators. In [31], it 
is found that with such specification the multigrid method still works fine, the conclusion is that 
bilinear prolongation is better for low Reynolds number cases, whereas hybrid interpolation is better for 
high Reynolds number cases. With application to various test problems, which are described later, we 
perform some further experiments and try to select the best choice. 


(35) 
- K s x 


NUMERICAL EXPERIMENTS AND RESULTS 


Three test problems are chosen, which are the square driven cavity problem, the skewed driven s - 

cavity problem and the L-shaped driven cavity problem, as illustrated in figure 2. These impose various 
difficulties. For brevity, we refer to the square driven cavity problem as problem 1, the skewed driven 
cavity problem as problem 2, and the L-shaped driven cavity as problem 3. In problem 1, the grid is 
uniform Cartesian. This gives the simplest discretization, because stretched mesh cells and mixed 
derivatives do not occur. In problem 2, the grid is still uniform but the grid lines are not orthogonal, so 
mixed derivatives occur. Giving rise to more difficulties, problem 3 has a stretched non-uniform 
non-orthogonal grid. For each test problem, two Reynolds numbers are considered. Re = 1 and 
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Figure 2: The three test problems and corresponding grids: a. The square driven cavity problem, 
b. the skewed driven cavity problem; c. the L-shaped driven cavity problem; d. the computational 
domain of the L-shaped driven cavity problem 


Re = 1000. The two cases represent viscosity-dominated flows and (mostly) convection-dominated 
flows. Benchmark solutions for Re = 1000 are provided in [3], [5], [11], respectively, for problems 1-3. 
All computations are carried out on an HP-730 computer. 

Prior to the measurement of reduction factors a linear system should be specified. It is natural to use 
equation (7) at steady state (more precisely, almost steady state). For Re = 1, 20 time steps with 
At = 1 are carried out to give the matrix and the right-hand side at the ‘steady state , with each time 
step accompanied by one multigrid iteration. Only one multigrid iteration is used because we do not 
want to compute the real time history and so it is not necessary to solve the linear system at each time 
step very accurately. For Re = 1000, the number of time steps is changed to 100 with At = 0.2. The 
smoother used in the computations for the ‘steady states’ is CILU, with the underrelaxation parameter u 
fixed at 0.7. A smaller time step is needed for larger Reynolds numbers to increase the main diagonal 
because the discretization uses central differencing, which results in bad smoothing for Re and At being 
too large. Figures 3-8 present the streamlines of the test problems. They match well with the 
corresponding results in [3], [5], [1 1]. 

In order to determine the best performance of each smoother, the underrelaxation parameter is 
sampled at an interval 0.1 to find a good value. Tables 1-3 give the reduction factors for the multignd 
methods using different smoothers on the 128 x 128 grids corresponding to the best values of the 
underrelaxation factor u. If machineaccuracy is not reached, the reduction factors for the last 5 
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iterations are given, otherwise the reduction factors for the last 5 successive iterations before machine 
accuracy is reached. The maximum number of grid levels l f is also given; exceeding this causes the 
algorithm to fail. 

From these tables, we deduce the following observations. 


• The SCGS smoother works for all test cases. However, it is clearly very problem -sensitive. The 
underrelaxation parameter changes significantly as the problems and the Reynolds number change. 
For problem 3 and Re = 1000 it is very slow. For problem 2, overrelaxation has to be employed 
instead of underrelaxation. The number of grid levels must be reduced in problem 3, i.e., the 
coarse grids cannot be very coarse in order to obtain smoothing. 

• The CLGS seems to work better than the SCGS smoother, because the underrelaxation parameter 
does not vary so much and usually the reduction factors are smaller. The one exceptional case is 
problem 1, where the number of grids has to be reduced by 1, even for Re = 1. What is more 
surprising in this case is that for Re = 1000 convergence cannot be achieved with forward 
horizontal line smoothing. But using forward vertical line smoothing and strong damping, we 
recover convergence, which is, however, worse than that of the SCGS smoother. To improve the 
performance for this case, perhaps symmetrical line smoothing should be used. So further tested 
are symmetrical horizontal fine smoothing (SHCLGS), symmetrical vertical line smoothing 
(SVCLGS) and symmetrical alternating line smoothing (SACLGS). It is found that SHCLGS and 
SACLGS both do not show any improvement, because the horizontal sweeps destroy smoothing 
seriously; SVCLGS gives some improvement, giving the best average reduction p = 0.5610 for 

uj = 0 . 2 . 

• With the SILU smoother, the underrelaxation parameter is less sensitive to change of problem and 
Reynolds number than with SCGS and CLGS, but the reduction factors are usuallylarger than 
those of SCGS and CLGS. The number of grid levels cannot exceed 4 or 5, otherwise the method 
does not work due to loss of smoothing on coarser grids. The well-known dependence of ILU 
smoothers on grid point ordering plays a role in problem 3. SILU is here found to be a bad 
smoother with lexicographic grid point ordering. The results presented have been obtained with a 
backward ordering, starting from comer D (cf. figure 2d) and moving first down and then to the 
left. 

• The CILU smoother is not problem-sensitive. Very good convergence is obtained for all test 
problems. It is possible to fix the underrelaxation parameter at one value, which here is found to 
be 0.8. The dependence on the grid point ordering is pronounced for problem 3, for which the 
backward ordering described for SILU was used. 


According to the above observations, we can arrange the four smoothers in the following order from the 
best to worst: CILU, CLGS, SCGS, SILU. Of course, this conclusion is not general, because 
discretization and transfer operators both certainly affect the overall performance of an algorithm. 

Apart from robustness, efficiency should also be taken into account. Table 4 gives the CPU time in 
seconds per cycle (t c ) for the smoothers. The most robust smoother CILU takes twice as much time per 
cycle as the other three smoothers. The efficiency of two multigrid methods using two smoothers 
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(referred to as method 1 and method 2) may be compared as follows. Let the average reduction factor 
of method 1 be ft and that of method 2 be ft, and let the CPU time per multigrid cycle be ft, and ft 2 , 
respectively. For a required accuracy, for example a reduction e of the initial residual norm, method 1 
takes t c i Inc/ In pi CPU time and method 2 takes t c2 In e/ In p 2 CPU time. Define the efficiency factor 
Ef of method 1 with respect to method 2 by 


f c2 ln£i (37) 

1 td In P2 

So if Er > 1, then method 1 is more efficient; if Ef < 1 then method 2 is more efficient. For 
comparisons among more than 2 methods, one of them is used as a standard, in place of method 2. 

Using p and t c given in tables 1-4 and taking CILU as the standard for the comparison, table 5 presents 
Er in all the test cases. Bigger numbers mean higher efficiency. Apparently, The SCGS smoother and 
the CLGS smoother are mostly more, but not very much, efficient than the CILU smoother; die SILU 
smoother is mosdy less efficient. Because SCGS and CLGS can be easily altered to parallellizable 
versions by using black-white or zebra ordering, one may argue that SCGS and CLGS have more 
parallellization potential than CILU, and higher efficiency can be obtained. But this may be true only m 

two dimensions. 

Now with CILU, we investigate convergence of the multigrid method using the hybrid interpolation 
instead of bilinear interpolation for the velocities in the momentum equations in the formulation of 
coarse grid operators. The results are given in table 6 in terms of the reduction factors for the best 
values of u>. Clearly, the method works much better for Re = 1000 than for Re = 1. Using the hybrid 
prolongation for Re = 1000 the method performs equally as well as the method using the bilinear 
prolongation. It is easy to see that for low Reynolds number cases bilinear prolongation is better, but 
this is not so clear for high Reynolds number cases. We found that for high Reynolds numbers there are 
some cases in which bilinear prolongation does not work but the hybrid prolongation still works well. 
Therefore it is safer to use the hybrid prolongation for high Reynolds numbers. One may conclude again 
that the hybrid prolongation is more suitable for high Reynolds numbers and bilinear prolongation is 
more suitable for low Reynolds numbers. 


CONCLUSIONS 


The performance of the multigrid method using SCGS, CLGS, SILU and CILU smoothers are 
studied for the incompressible Navier-Stokes equations in general coordinates. Galerkin coarse grid 
approximation is used in the computation of coarse grid matrices. Both robustness and efficiency of the 
methods are investigated and measured in terms of reduction factors and efficiency factors. The results 
show that the most robust smoother is CILU; CLGS and SCGS follow. SILU is the worst. For 
efficiency, the order from the best to the worst is CLGS, SCGS, CILU and SILU. Although CILU is 
somewhat less efficient than CLGS and SCGS and it has less parallellization potential in two 
dimensions, it may be more promising in three dimensions because it is much more robust than all the 
others and parallellization can also be established among planes. 
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For prolongation operators in the computation of coarse grid operators, the hybrid interpolation is a 

more appropriate choice for high Reynolds numbers, whereas bilinear interpolation is a more appropriate 
choice for low Reynolds numbers. 
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Figure 3: Streamlines for problem 1, Figure 4: Streamlines for problem 1, 

Re = 1, r*./r® < 4.581 x 10" 11 Re = 1000, r‘/rj < 1.804 x lO' 3 





Figure 5: Streamlines for problem 2, 

Re = 1, ri/r® < 4.358 x 10" 10 


Figure 6: Streamlines for problem 2, 
Re = 1000, r*/r? < 4.484 x 10~ 6 



Figure 7: Streamlines for problem 3, 

Re = 1, r*/r® < 9.723 x 10~ 9 


Figure 8: Streamlines for problem 3, 

i*e = 1000, r*/r? < 1.172 x 10~ 4 
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Table 1: Reduction Factors Corresponding to the Best Values of u; for Problem 1 on the 128 x 128 Grid 


Smoother 

SCGS 

CLGS 

SILU 

CILU 

SCGS 

CLGS 

SILU 

CILU 

Re = 1, r Q = 12.96 

Re = 1000, r 0 = 1.605 x 10~ 02 

UJ 

0.8 

1.0 

0.9 

1.0 

0.3 

o.r 

0.7 

1.0 

h 

7 

6 

4 

7 

7 

6 

4 

7 

i 

16 

16 

16 

15 

16 

16 

16 

16 

pi 

0.2787 

0.2183 

0.3708 

0.2026 

0.6464 

0.8344 

0.8168 

0.4122 

Pi+ 1 

0.2811 

0.2184 

0.3761 

0.2079 

0.6420 

0.8950 

0.8009 

0.4116 

Pi+2 

0.2789 

0.2237 

0.3807 

0.2142 

0.5994 

0.8735 

0.8244 

r~oT4136 

Pi + 3 

0.2816 

0.2235 

0.3849 

0.2224 

0.6088 

0.9048 

0.8846 

0.4155 

Pi- f4 

0.2791 

0.2300 

0.3880 

0.2393 

0.5869 

0.8899 

0.9346 

0.4131 

p 

0.2561 

0.1973 

0.2863 

0.1732 

0.4918 

0.7773 

0.7005 

0.2996 


* Forward vertical smoothing 


Table 2: Reduction Factors Corresponding to the Best Values of u for Problem 2 on the 128 x 128 Grid 


Smoother 

SCGS 

CLGS 

SILU 

CILU 

SCGS 

CLGS 

SILU 

CILU 

Re = 1, r 0 = 25.92 

Re = 1000, T-o = 2.697 x 10~ 02 

CJ 

1.2 

0.9 

0.9 

0.8 

1.2 

1.0 

0.8 

0.9 

h 

7 

7 

4 

7 

7 

7 

4 

7 

i 

16 

16 

16 

16 

16 

16 

16 

16 

pi 

0.3377 

0.3476 

0.7401 

0.3857 

0.3693 

0.4166 

0.7784 

0.3052 

Pi + 1 

0.3406 

0.3492 

0.7418 

0.3885 

0.3650 

0.4167 

0.7798 

0.3190 

Pi+2 

0.3452 

0.3512 

0.7437 

0.3911 

0.3613 

0.4173 

0.7811 

0.3312 

Pi+2 

0.3432 

0.3536 

0.7456 

0.3931 

0.3582 

0.4180 

0.7825 

0.3399 

Pi+4 

0.3463 

0.3563 

0.7476 

0.3942 

0.3558 

0.4184 

0.7839 

0.3447 

p 

0.3032 

0.3205 

0.6315 

0.2968 

0.3472 

0.3941 

0.6716 

0.2802 
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Table 3: Reduction Factors Corresponding to the Best Values of u for Problem 3 on the 128 x 128 Gnd 


Smoother 

SCGS 

CLGS 

SILU 

CILU 

SCGS 

CLGS 

SILU 

CILU 

Re = 1, r 0 = 18.20 

Re = 1000, T-o = 1.969 x 10~ 02 

U/ 

1.0 

0.9 

o 

bo 

* 

0.8* 

0.1 

0.4 

0.2* 

0.8* 

If 

6 

7 

5 

7 

5 

7 

5 

7 

i 

16 

15 

16 

16 

16 

16 

16 

16 

pi 

0.7302 

0.2320 

0.5960 

0.6997 

0.9381 

0.6527 

0.9293 

0.3496 

p t+i 

0.7319 

0.1699 

0.5878 

0.4104 

0.9399 

0.6354 

0.9337 

0.3355 

Pi+2 

0.7334 

0.2131 

0.5914 

0.2317 

0.9400 

0.6425 

0.9376 

0.3344 

Pi + 3 

0.7347 

0.1941 

0.5927 

0.6643 

0.9383 

0.6536 

0.9411 

0.3448 

P i+4 

0.7359 

0.2614 

0.5909 

0.4450 

0.9352 

0.6386 

0.9442 

0.3292 

P 

0.5914 

0.1645 

0.4992 

0.3673 

0.7815 

0.5422 

0.8183 

0.2795 


* Backward lexicographical ordering of grid points 


Table 4: CPU Time Needed by One Multigrid 
Cycle on 128 x 128 Grid 


Smoother 

SCGS 

CLGS 

SILU 

CILU 

tc 

25.0 

23.4 

28.9 

56.3 


Table 5. The Efficiency Factor Ej for All Test Cases 


Smoother 

SCGS 

CLGS 

SILU 

CILU 

SCGS 

CLGS 

SILU CILU 

Re = 1 

Re = 1000 

Problem 1 

1.7 

2.2 

1.4 

1.0 

1.3 

0.5 

0.6 

1.0 

Problem 2 

2.2 

2.3 

0.7 

1.0 

1.9 

1.8 

0.6 

1.0 

Problem 3 

1.2 

4.3 

1.4 

1.0 

0.4 

1.2 

0.3 

1.0 
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Table 6: Reduction Factors of the Multigrid Method Using CILU and the Hybrid 
Prolongation for Various Problems on the 128 x 128 Grids 


Problem 

Problem 1 

Problem 2 

Problem 3* 

1! 

UJ 

1.0 

1.0 

0.2 

h 

7 

7 

7 

i 

16 

16 

16 

Pi j 

0.5878 

0.7986 

0.7560 

p t+i 

0.5900 

0.8028 

0.7538 

Pi+2 

0.5919 

0.8064 

0.7520 

Pi + 3 

/ 0.5935 

0.8095 

0.7504 

Pi + 4 

0.5949 

0.8121 

0.7492 

ro 

0.1296 x 10+ 02 

0.2592 x 10 +02 

0.1802 x 10+ 02 

Ti+4 

0.2396 x 10" 05 

0.9662 x 10" 03 

0.3300 x 10' 02 

p 

0.4606 

0.6006 

0.6503 

Re = 1000 

UJ 

1.1 

1.0 

0.7 

If 

7 

7 

7 

i 

16 

16 

16 

pi 

0.3732 

0.3282 

0.3286 

Pi+\ 

0.3778 

0.3159 

0.3282 

Pi+2 

0.3616 

0.3222 

0.3267 

Pi + 3 

0.3746 

0.3629 

0.3253 

Pi +4 

0.3861 

0.3837 

0.3278 

7-0 

0.1605 x 10~ 01 

0.2697 x 10" 01 

0.1969 x lO" 01 

n+4 

0.1491 x 10~ 12 

0.8231 x 10“ 12 

0.3485 x 10- 12 

P 

0.2808 

0.2980 

0.2900 


* Backward lexicographical ordering of grid points 
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